1 Establishing the semantic web 1: Data extraction and label assignment for web
databases 

Jiying Wang, Fred H. Lochovsky 

May 2003 Proceedings of the 12th international conference on World Wide Web 

Additional Information: full citation, abstract, references, citings, index terms



Full text available: pdf(651.74 KB)



Many tools have been developed to help users query, extract and integrate data from web 
pages generated dynamically from databases, i.e., from the Hidden Web. A key prerequisite 
for such tools is to obtain the schema of the attributes of the retrieved data. In this paper, 
we describe a system called DeLa, which reconstructs (part of) a "hidden" back-end web
database. It does this by sending queries through HTML forms, automatically generating 
regular expression wrappers to extract ... 

Keywords: HTML forms, automatic wrapper induction, data annotation, hidden web, 
information integration, web information extraction 



2 Data integrity: Web application security assessment by fault injection and behavior
monitoring 

Yao-Wen Huang, Shih-Kun Huang, Tsung-Po Lin, Chung-Hung Tsai 

May 2003 Proceedings of the 12th international conference on World Wide Web 

Additional Information: full citation, abstract, references, citings, index terms



Full text available: 



As a large and complex application platform, the World Wide Web is capable of delivering a 
broad range of sophisticated applications. However, many Web applications go through 
rapid development phases with extremely short turnaround time, making it difficult to 
eliminate vulnerabilities. Here we analyze the design of Web application security 
assessment mechanisms in order to identify poor coding practices that render Web 
applications vulnerable to attacks such as SQL injection and cross-site scr ... 

Keywords: black-box testing, complete crawling, fault injection, security assessment, web 
application testing 



3 Computational aspects of resilient data extraction from semistructured sources 
(extended abstract) 
Hasan Davulcu, Guizhen Yang, Michael Kifer, I. V. Ramakrishnan

May 2000 Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on 
Principles of database systems 

Full text available: pdf(259.33 KB) Additional Information: full citation, abstract, references, citings, index terms

Automatic data extraction from semistructured sources such as HTML pages is rapidly 
growing into a problem of significant importance, spurred by the growing popularity of the 
so-called "shopbots" that enable end users to compare prices of goods and other services at
various web sites without having to manually browse and fill out forms at each one of these
sites. The main problem one has to contend with when designing data extraction techniques
is that the contents of ... 

4 Efficient Web form entry on PDAs

Oliver Kaljuvee, Orkut Buyukkokten, Hector Garcia-Molina, Andreas Paepcke 

April 2001 Proceedings of the 10th international conference on World Wide Web 

Full text available: pdf(398.94 KB) Additional Information: full citation, references, citings, index terms



Keywords: PDA, WAP, forms, mobile computing, wireless access 



5 Efficient web browsing on handheld devices using page and form summarization 
January 2002 ACM Transactions on Information Systems (TOIS), Volume 20 issue 1

Full text available: pdf(4.47 MB) Additional Information: full citation, abstract, references, citings, index terms, review

We present a design and implementation for displaying and manipulating HTML pages on 
small handheld devices such as personal digital assistants (PDAs), or cellular phones. We 
introduce methods for summarizing parts of Web pages and HTML forms. Each Web page is 
broken into text units that can each be hidden, partially displayed, made fully visible, or 
summarized. A variety of methods are introduced that summarize the text units. In 
addition, HTML forms are also summarized by displaying just the t ... 

Keywords: PDA, Personal digital assistant, WAP, WML, forms, handheld computers, mobile 
computing, summarization, ubiquitous computing, wireless computing 

6 Web browsing in a wireless environment: disconnected and asynchronous operation in 
ARTour Web Express 

Henry Chang, Carl Tait, Norman Cohen, Moshe Shapiro, Steve Mastrianni, Rick Floyd, Barron 
Housel, David Lindquist 

September 1997 Proceedings of the 3rd annual ACM/IEEE international conference on 
Mobile computing and networking 

Full text available: pdf(1.50 MB) Additional Information: full citation, references, citings, index terms



7 Information gathering in the World-Wide Web: the W3QL query language and the 
W3QS system 

David Konopnicki, Oded Shmueli 

December 1998 ACM Transactions on Database Systems (TODS), Volume 23 issue 4 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

The World Wide Web (WWW) is a fast growing global information resource. It contains an 
enormous amount of information and provides access to a variety of services. Since there is
no central control and very few standards of information organization or service offering, 
searching for information and services is a widely recognized problem. To some degree this 
problem is solved by "search services," also known as "indexers," such as Lycos, AltaVista,
Yahoo, and others. ... 

Keywords: CGI, FORMS, HTML, HTTP, PERL, World-Wide Web, query language, query 
system 



8 Semistructured and structured data in the Web: going back and forth
Paolo Atzeni, Giansalvatore Mecca, Paolo Merialdo 
December 1997 ACM SIGMOD Record, Volume 26 issue 4 

Full text available: pdf(848.60 KB) Additional Information: full citation, citings, index terms



9 Model-driven development of Web applications: the AutoWeb system
Piero Fraternali, Paolo Paolini 

October 2000 ACM Transactions on Information Systems (TOIS), Volume 18 issue 4 



This paper describes a methodology for the development of WWW applications and a tool 
environment specifically tailored for the methodology. The methodology and the 
development environment are based upon models and techniques already used in the 
hypermedia, information systems, and software engineering fields, adapted and blended in 
an original mix. The foundation of the proposal is the conceptual design of WWW 
applications, using HDM-lite, a notation for the specification of structure, nav ... 

Keywords: HTML, WWW, application, development, intranet, modeling 



10 Timeline to efficiency
Tammy Hohlt, Kristina Cunningham 

October 2004 Proceedings of the 32nd annual ACM SIGUCCS conference on User 
services 

Full text available: pdf Additional Information: full citation, abstract, index terms

When you are in a position that relies on student employees as the majority of your 
workforce, there are many issues that you must handle and it is imperative to put the most 
efficient system into place. At the University of Missouri-Columbia, IAT Services, our 
organization employs approximately 175 students per semester. The processes that we 
have in place for hiring, training and performance evaluations are continuously being 
revamped and improved upon by our staff and employees. We have a ... 

Keywords: computing sites, customer service, hiring and evaluations, interviewing, master 
calendar, mentoring, teamwork, timeline, training 



11 Implementing shared manufacturing services on the World-Wide Web 

J. W. Erkes, K. B. Kenny, J. W. Lewis, B. D. Sarachan, M. W. Sobolewski, R. N. Sum 
February 1996 Communications of the ACM, Volume 39 issue 2

Full text available: pdf(404.10 KB) Additional Information: full citation, references, citings, index terms



Full text available: pdf(6.94 MB)



Additional Information: full citation, abstract, references, citings, index terms
12 Research sessions: Web, XML and IR: Understanding Web query interfaces: best-effort
parsing with hidden syntax

Zhen Zhang, Bin He, Kevin Chen-Chuan Chang 

June 2004 Proceedings of the 2004 ACM SIGMOD international conference on 
Management of data 

Full text available: pdf(431.29 KB) Additional Information: full citation, abstract, references, citings

Recently, the Web has been rapidly "deepened" by many searchable databases online, 
where data are hidden behind query forms. For modelling and integrating Web databases, 
the very first challenge is to understand what a query interface says, or what query
capabilities a source supports. Such automatic extraction of interface semantics is 
challenging, as query forms are created autonomously. Our approach builds on the 
observation that, across myriad sources, query forms seem to reveal some ... 

13 Supporting the writing of reports in a hierarchical organization 
Andreas Girgensohn 

March 1999 ACM SIGSOFT Software Engineering Notes, Proceedings of the international
joint conference on Work activities coordination and collaboration, Volume 24 Issue 2
Full text available: pdf(1.31 MB) Additional Information: full citation, abstract, references, index terms

In many hierarchical companies, reports from several independent groups must be merged 
to form a single, company-wide report. This paper describes a process and system for 
creating and structuring such reports and for propagating contributions up the organization. 
The system has been in regular use, in-house, by about 30 users for over a year to create 
monthly status reports. Our experiences indicate that it is possible to change a monthly 
reporting practice so that the system is easy to use, im ... 

Keywords: World Wide Web, collaborative writing, corporate memory, hierarchical 
organizations, report generation, user feedback 



14 Spoken dialogue technology: enabling the conversational user interface 
Michael F. McTear 

March 2002 ACM Computing Surveys (CSUR), Volume 34 issue 1

Full text available: pdf(987.69 KB) Additional Information: full citation, abstract, references, citings, index terms, review

Spoken dialogue systems allow users to interact with computer-based applications such as 
databases and expert systems by using natural spoken language. The origins of spoken 
dialogue systems can be traced back to Artificial Intelligence research in the 1950s 
concerned with developing conversational interfaces. However, it is only within the last 
decade or so, with major advances in speech technology, that large-scale working systems 
have been developed and, in some cases, introduced into commerc ... 

Keywords: Dialogue management, human computer interaction, language generation, 
language understanding, speech recognition, speech synthesis 



15 Fast detection of communication patterns in distributed executions 
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced Studies 
on Collaborative research 

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based on 
process-time diagrams are often used to obtain a better understanding of the execution of
the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not provide 
the user with the desired overview of the application. In our experience, such tools display 
repeated occurrences of non-trivial commun ... 

16 Information retrieval on the web
Mei Kobayashi, Koichi Takeda 

June 2000 ACM Computing Surveys (CSUR), Volume 32 issue 2 

Full text available: pdf(213.89 KB) Additional Information: full citation, abstract, references, citings, index terms

In this paper we review studies of the growth of the Internet and technologies that are 
useful for information search and retrieval on the Web. We present data on the Internet 
from several different sources, e.g., current as well as projected number of users, hosts, 
and Web sites. Although numerical figures vary, overall trends cited by the sources are 
consistent and point to exponential growth in the past and in the coming decade. Hence it is 
not surprising that about 85% of Internet user ... 

Keywords: Internet, World Wide Web, clustering, indexing, information retrieval, 
knowledge management, search engine 



17 Generating HTML sources with TFE enhanced SQL 
Toshiyuki Seto, Takuhiro Nagafuji, Motomichi Toyama 

April 1997 Proceedings of the 1997 ACM symposium on Applied computing 